1 Introduction

The Microblog track was first introduced in 2011, and we have participated in it for four years [1,2].
This year's Microblog track has two tasks. The first, the ad-hoc search task, is the same as in
previous years: retrieve all the tweets that are relevant to a query Q and were posted before time T.
Participants access the corpus through the official APIs. The second is the Tweet Timeline Generation
(TTG) task. It is newly introduced this year, and its main goal is to detect and remove the redundant
tweets among those retrieved by the first task.

This report is organized as follows. Section 2 describes the data preparation. Section 3 presents
our methodology and framework for the ad-hoc search task. Section 4 presents our methodology for the
TTG task. Section 5 gives the final results of the two tasks.

2  Data  Preparation 

We downloaded twitter-tools [3] from GitHub and used it to interact with the service API and
download the original tweets for each topic of 2011, 2012, 2013 and 2014. We retrieved 10000 tweets
for each query and stored them in separate files. We also stored the tweets of 2011, 2012 and 2013,
since they were used as training data for our supervised framework.

We also used TweetAnalyzer, the officially supplied tool, to stem the tweets and split them into
words. Since it has been shown that stop word removal might have a negative impact on the final
ranking results, we did not remove stop words.

3  Ad-hoc  Task  Methodology 

We treat this task as a re-ranking problem over the tweets already downloaded for the topics of
2011, 2012, 2013 and 2014, and we solve it with a learning-to-rank model. First, we extract features
for each query-document pair. We then train the model with the SVMrank tool [4] on the data of 2011
and 2012, whose relevance judgments are already known. That is, we first use the data of 2011 and
2012 as the training corpus and the data of 2013 as the test corpus. Finally, we train the model on
all the data of 2011-2013 and use it to predict the ranking of the 2014 data. Besides the features
used in our previous work [5], we further consider features computed by the following methods.
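
As an illustration of how query-document features reach the ranker, the sketch below writes one training example in the plain-text input format read by SVMrank (`<target> qid:<id> <feature>:<value> ... # comment`). The function name, the feature values and the tweet id are hypothetical; the actual feature set is the one described in this section and in [5].

```python
def to_svmrank_line(relevance, qid, features, comment=""):
    """Format one (query, tweet) pair as an SVMrank input line.

    relevance: graded relevance label of the tweet for the query
    qid:       integer topic id; all tweets of one topic share it
    features:  list of feature values (hypothetical order)
    """
    # Feature ids are 1-based and listed in increasing order.
    pairs = " ".join(f"{i}:{v:.6g}" for i, v in enumerate(features, start=1))
    line = f"{relevance} qid:{qid} {pairs}"
    return f"{line} # {comment}" if comment else line


# Example with made-up values: three features for one tweet of topic 51.
example = to_svmrank_line(2, 51, [0.83, 12.4, 0.0], "tweet123")
```

Lines belonging to the same topic are grouped under the same `qid`, so that the ranker only compares tweets retrieved for the same query.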

3.1  Query  Expansion 

We use the Bo1 model [11] to obtain query expansion words. We pick the top-k documents of a
topic and use them to produce the expansion words. Every word t in this set has a weight w(t), which
is given by (1):

$$w(t) = tf \cdot \log_2 \frac{1 + p}{p} + \log_2 (1 + p) \qquad (1)$$

where tf is the frequency of t in the top-ranked documents and p is given by F/N, where F represents
the term frequency of t in the whole corpus and N represents the total number of documents in the
corpus. The Bo1 model can be used not only to produce query expansion words for BM25 score
computation, but can also be incorporated into the language model.
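
Equation (1) can be sketched in a few lines of code. This is a minimal illustration rather than our production implementation: the function names, the in-memory data structures and the base-2 logarithm (the usual choice for Bo1) are assumptions.

```python
import math
from collections import Counter


def bo1_weights(feedback_docs, corpus_tf, num_docs):
    """Weight every term of the top-ranked documents with Eq. (1).

    feedback_docs: list of token lists (the top-k documents)
    corpus_tf:     dict term -> frequency F in the whole corpus
    num_docs:      N, the number of documents in the corpus
    """
    tf = Counter(t for doc in feedback_docs for t in doc)
    weights = {}
    for term, tf_x in tf.items():
        F = corpus_tf.get(term, 0)
        if F == 0:
            continue  # no corpus statistics, p would be zero
        p = F / num_docs
        weights[term] = tf_x * math.log2((1 + p) / p) + math.log2(1 + p)
    return weights


def expansion_terms(weights, k):
    """Return the k terms with the highest Bo1 weight."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Toy check: with p = 1 the weight reduces to tf + 1.
toy = bo1_weights([["a", "b"], ["a"]], {"a": 1, "b": 1}, 1)
# A term that is rare in the corpus outweighs a frequent one at equal tf.
rare = bo1_weights([["x", "y"]], {"x": 1, "y": 100}, 1000)
```

Terms that are frequent in the feedback documents but rare in the corpus receive the largest weights, which is what makes them good expansion candidates.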

3.2 Word Vector

We also use word2vec [6] to obtain a feature value. We set the dimension of each word vector to
200. The vector of a query is computed by summing the vectors of its words, each weighted by the
word's tf-idf. The vector of a document is computed in the same way. Finally, we compute the cosine
similarity of the two vectors and use this value as a feature.
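
The feature computation can be sketched as follows. The word vectors and idf values are assumed to be available as plain dictionaries (hypothetical names `vectors` and `idf`); words without a vector are skipped, which is an assumption of this sketch.

```python
import math


def weighted_sum_vector(tokens, vectors, idf, dim):
    """Sum the vectors of the tokens, each weighted by tf * idf."""
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    acc = [0.0] * dim
    for t, tf in counts.items():
        if t not in vectors:
            continue  # out-of-vocabulary word: no vector to add
        w = tf * idf.get(t, 0.0)
        for i, x in enumerate(vectors[t]):
            acc[i] += w * x
    return acc


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


# Toy 2-dimensional vectors (the runs use 200 dimensions).
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
idf = {"a": 1.0, "b": 1.0}
q = weighted_sum_vector(["a"], vectors, idf, 2)
same = cosine(q, weighted_sum_vector(["a", "a"], vectors, idf, 2))
diff = cosine(q, weighted_sum_vector(["b"], vectors, idf, 2))
```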

3.3  Language  Model 

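Besides the Bo1 model, we also used a mixture model to estimate the query language model, which
regards a document as a mixture of themes [7]. It can be written as:

$$p_d(w) = \lambda_B \, p(w \mid \theta_B) + (1 - \lambda_B) \sum_{j=1}^{k} \left[ \pi_{d,j} \, p(w \mid \theta_j) \right]$$

$$\log p(C \mid \Lambda) = \sum_{i=1}^{m} \sum_{d \in C_i} \sum_{w \in V} \left[ c(w,d) \times \log \left( \lambda_B \, p(w \mid \theta_B) + (1 - \lambda_B) \sum_{j=1}^{k} \pi_{d,j} \, p(w \mid \theta_j) \right) \right]$$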

We use Maximum Likelihood Estimation (MLE) with smoothing to learn the tweet language
model. The smoothing methods used are Jelinek-Mercer, Dirichlet, and Absolute Discounting. Finally,
we use the KL divergence between the query and tweet language models to measure the relevance of
the tweet to the topic.
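
The scoring step can be sketched as below, using the textbook forms of the three smoothing methods. The parameter values (`lam`, `mu`, `delta`) and all names are illustrative assumptions, not the settings of our runs; the collection model `coll_prob` is assumed to cover every query word.

```python
import math
from collections import Counter


def jelinek_mercer(word, counts, doc_len, coll_prob, lam=0.5):
    """Linear interpolation of the MLE estimate and the collection model."""
    ml = counts.get(word, 0) / doc_len if doc_len else 0.0
    return (1 - lam) * ml + lam * coll_prob[word]


def dirichlet(word, counts, doc_len, coll_prob, mu=100.0):
    """Dirichlet prior smoothing with pseudo-count mu."""
    return (counts.get(word, 0) + mu * coll_prob[word]) / (doc_len + mu)


def absolute_discount(word, counts, doc_len, coll_prob, delta=0.7):
    """Subtract delta from each seen count and give the mass to the collection."""
    if doc_len == 0:
        return coll_prob[word]
    seen = max(counts.get(word, 0) - delta, 0.0)
    return (seen + delta * len(counts) * coll_prob[word]) / doc_len


def kl_divergence(query_model, doc_tokens, coll_prob, smooth=dirichlet):
    """KL(query || tweet); a smaller value means a more relevant tweet."""
    counts = Counter(doc_tokens)
    n = len(doc_tokens)
    kl = 0.0
    for w, pq in query_model.items():
        if pq <= 0.0:
            continue  # zero-probability query words contribute nothing
        kl += pq * math.log(pq / smooth(w, counts, n, coll_prob))
    return kl


coll = {"a": 0.5, "b": 0.5}
query = {"a": 1.0}
results = {
    f.__name__: (kl_divergence(query, ["a", "a"], coll, f),
                 kl_divergence(query, ["b", "b"], coll, f))
    for f in (jelinek_mercer, dirichlet, absolute_discount)
}
```

In the toy example, the tweet that contains the query word obtains the smaller divergence under all three smoothing methods.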

3.4  BM25  model  with  term  proximity 

BM25 is one of the most commonly used retrieval models, but it does not take term proximity
into account. We therefore proposed the minimum window BM25 [8]. The main idea of the method is
that a document in which all query terms appear within a small window is more likely to be relevant.
We also use another variant of BM25, named BM25PF [9], which also considers term proximity: it
combines phrase frequency information with the basic BM25 model to rank the documents.
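
To make the notion of a minimum window concrete, the sketch below finds the length of the shortest span of a document that contains every query term. It only illustrates the window computation; how the window size enters the BM25 score is defined in [8] and is not reproduced here. The function name is hypothetical.

```python
def min_window(doc_tokens, query_terms):
    """Length of the shortest token span covering all query terms.

    Returns None if some query term does not occur in the document.
    """
    need = set(query_terms)
    if not need:
        return None
    counts = {}   # occurrences of each query term inside the window
    covered = 0   # number of distinct query terms inside the window
    best = None
    left = 0
    for right, tok in enumerate(doc_tokens):
        if tok in need:
            counts[tok] = counts.get(tok, 0) + 1
            if counts[tok] == 1:
                covered += 1
        # Shrink from the left while the window still covers all terms.
        while covered == len(need):
            width = right - left + 1
            if best is None or width < best:
                best = width
            lt = doc_tokens[left]
            if lt in need:
                counts[lt] -= 1
                if counts[lt] == 0:
                    covered -= 1
            left += 1
    return best


tight = min_window(["a", "x", "b", "a", "b"], ["a", "b"])
missing = min_window(["a", "x"], ["a", "b"])
```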

4  TTG  Task  Methodology 

We apply two clustering methods to the tweets returned by the ad-hoc retrieval system to capture
a timeline summary of the relevant information. We chose the Single-Pass method and the Affinity
Propagation (AP) method for the sake of both performance and speed.

Single-Pass clustering needs only one traversal of the tweets. Each new tweet is compared, by
tweet similarity, with every earlier tweet in every cluster. If the largest similarity is larger
than the similarity threshold, the new tweet is put into the cluster that contains the corresponding
tweet. If none of the similarities is larger than the threshold, the new tweet is put into a new
cluster, which initially contains only that tweet. The process ends when all the tweets have been
put into clusters. The final clusters are the result of the TTG task.
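
The procedure above can be sketched as follows. The similarity measure is not fixed by the description, so the Jaccard coefficient on token sets is used here purely as an assumption, and the threshold value in the example is arbitrary.

```python
def jaccard(a, b):
    """Jaccard similarity of two token lists (assumed similarity measure)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def single_pass(tweets, threshold, sim=jaccard):
    """Cluster tweets in one traversal; returns lists of tweet indices."""
    clusters = []
    for i, tweet in enumerate(tweets):
        best_sim, best_cluster = 0.0, None
        # Compare the new tweet with every earlier tweet in every cluster.
        for cluster in clusters:
            for j in cluster:
                s = sim(tweet, tweets[j])
                if s > best_sim:
                    best_sim, best_cluster = s, cluster
        if best_cluster is not None and best_sim > threshold:
            best_cluster.append(i)   # join the most similar tweet's cluster
        else:
            clusters.append([i])     # start a new cluster
    return clusters


tweets = [["a", "b", "c"], ["a", "b", "c", "d"], ["x", "y"], ["x", "y", "z"]]
clusters = single_pass(tweets, threshold=0.5)
```

Because each tweet is compared with all earlier tweets, the cost grows quadratically with the number of tweets, but no iteration over the collection is repeated.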

AP clustering is a state-of-the-art clustering method. It maintains the Responsibility and
Availability matrices, which represent how well suited Tweet A is to serve as the exemplar of Tweet B
and how appropriate it is for Tweet B to be a follower of Tweet A. The matrices are updated in each
iteration until convergence. The final clusters are the result of the TTG task. In addition, the
algorithm also yields the exemplar of each cluster, that is, the most representative tweet of each
cluster.

5 Experimental Results

For the ad-hoc task, we submitted 4 runs. ICTNETRUN1 uses all the features mentioned
above. ICTNETRUN2 does not use the features generated by the language model. ICTNETRUN3 does not
use the features generated by the word vector. ICTNETRUN4 does not use the BM25PF score feature.

We set the parameter of the SVM model to 0.3 and use the data of 2011, 2012 and 2013 as the
training set to train the model. After that, we use the model to predict the ranking of the 2014
data. The results are shown in Table 1.

Table  1.  Evaluation  results  for  ICTNET  submitted  runs  of  task  1. 


Run tag        R-Prec    MAP       P@30
ICTNETRUN1     0.4017    0.3534    0.5800
ICTNETRUN2     0.3734    0.3062    0.5109
ICTNETRUN3     0.4411    0.4139    0.6212
ICTNETRUN4     0.4369    0.4141    0.6242
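
For reference, the three measures reported in Table 1 can be computed per topic as in the sketch below, using their standard definitions; MAP is the mean of the average precision over all topics. The function names and the toy ranking are illustrative only, and the official scores come from the track's evaluation tools.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked tweets that are relevant (P@k)."""
    return sum(1 for d in ranked[:k] if d in relevant) / k


def r_precision(ranked, relevant):
    """Precision at rank R, where R is the number of relevant tweets."""
    r = len(relevant)
    return precision_at_k(ranked, relevant, r) if r else 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks of the relevant tweets."""
    hits, total = 0, 0.0
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
p2 = precision_at_k(ranked, relevant, 2)
rp = r_precision(ranked, relevant)
ap = average_precision(ranked, relevant)
```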

For  the  second  task,  the  result  is  shown  in  Table  2. 

Table 2. Evaluation results for ICTNET submitted runs of task 2.

Run tag         unweighted_recall    weighted_recall    precision
ICTNETAP3       0.2234               0.4623             0.1792
ICTNETAP4       0.2528               0.4836             0.1702
ICTNETRUNSP3    0.2921               0.3959             0.1054
ICTNETRUNSP4    0.3410               0.4868             0.1029